Smart Paper Notes

Mutual Visibility by Fat Robots with Slim Omnidirectional Camera

Kaustav Bose, Abhinav Chakraborty, Krishnendu Mukhopadhyaya

Category: Robotics

2022-06-15

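In the existing literature on autonomous robot swarms, the adopted visibility models rely on idealistic assumptions that are inconsistent with practical sensing-device implementations. This paper investigates the Mutual Visibility problem in a more realistic visibility model, called opaque fat robots with slim omnidirectional cameras. The robots are modeled as unit disks, each carrying an omnidirectional camera represented as a smaller disk. We assume the robots have compasses that let them agree on the direction and orientation of both axes of their local coordinate systems. The robots are equipped with visible lights, which serve as a medium of communication and also as a form of memory. We present a distributed algorithm for the Mutual Visibility problem that is provably correct in the semi-synchronous setting. Our algorithm also yields a solution to Leader Election, which we use as a subroutine in the main algorithm. Although Leader Election is trivial under two-axis agreement in the full-visibility model, it is challenging in our setting and is of independent interest.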

Translated by Google Translate

Self-Activating Neural Ensembles for Continual Reinforcement Learning

Sam Powers, Eliot Xing, Abhinav Gupta

Categories: Machine Learning | Artificial Intelligence

2022-12-31

The ability of an agent to continuously learn new skills without catastrophically forgetting existing knowledge is of critical importance for the development of generally intelligent agents. Most methods devised to address this problem depend heavily on well-defined task boundaries, and thus depend on human supervision. Our task-agnostic method, Self-Activating Neural Ensembles (SANE), uses a modular architecture designed to avoid catastrophic forgetting without making any such assumptions. At the beginning of each trajectory, a module in the SANE ensemble is activated to determine the agent's next policy. During training, new modules are created as needed and only activated modules are updated to ensure that unused modules remain unchanged. This system enables our method to retain and leverage old skills, while growing and learning new ones. We demonstrate our approach on visually rich procedurally generated environments.
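
The activate, create, and update-only-the-active-module cycle described above can be illustrated with a toy sketch. The abstract does not state how SANE chooses which module to activate, so the rule used here (nearest scalar `anchor` within a hypothetical `radius`) and every name in the code are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Module:
    anchor: float       # hypothetical summary of the situations this module covers
    value: float = 0.0  # stand-in for the module's policy parameters
    updates: int = 0    # how many times this module has been trained


class Ensemble:
    def __init__(self, radius: float) -> None:
        self.radius = radius  # assumed activation radius, not from the paper
        self.modules: List[Module] = []

    def activate(self, observation: float) -> Module:
        """Pick the nearest module, or create a new one if none is close enough."""
        if self.modules:
            best = min(self.modules, key=lambda m: abs(m.anchor - observation))
            if abs(best.anchor - observation) <= self.radius:
                return best
        new_module = Module(anchor=observation)
        self.modules.append(new_module)
        return new_module

    def train_on_trajectory(self, observation: float, target: float,
                            lr: float = 0.5) -> Module:
        """Update only the activated module; all other modules stay untouched."""
        module = self.activate(observation)
        module.value += lr * (target - module.value)
        module.updates += 1
        return module
```

Because only the activated module is written to, training on a new kind of trajectory cannot overwrite what an unrelated module has already learned, which is the property the abstract attributes to the modular design.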

NIRVANA: Neural Implicit Representations of Videos with Adaptive Networks and Autoregressive Patch-wise Modeling

Shishira R Maiya, Sharath Girish, Max Ehrlich, Hanyu Wang, Kwot Sin Lee, Patrick Poirson, Pengxiang Wu, Chen Wang, Abhinav Shrivastava

Category: Computer Vision

2022-12-30

Implicit Neural Representations (INR) have recently been shown to be a powerful tool for high-quality video compression. However, existing works are limited because they do not explicitly exploit the temporal redundancy in videos, leading to a long encoding time. Additionally, these methods have fixed architectures which do not scale to longer videos or higher resolutions. To address these issues, we propose NIRVANA, which treats videos as groups of frames and fits separate networks to each group performing patch-wise prediction. This design shares computation within each group, in the spatial and temporal dimensions, resulting in reduced encoding time of the video. The video representation is modeled autoregressively, with networks fit on a current group initialized using weights from the previous group's model. To further enhance efficiency, we perform quantization of the network parameters during training, requiring no post-hoc pruning or quantization. When compared with previous works on the benchmark UVG dataset, NIRVANA improves encoding quality from 37.36 to 37.70 (in terms of PSNR) and the encoding speed by 12X, while maintaining the same compression rate. In contrast to prior video INR works which struggle with larger resolution and longer videos, we show that our algorithm is highly flexible and scales naturally due to its patch-wise and autoregressive designs. Moreover, our method achieves variable bitrate compression by adapting to videos with varying inter-frame motion. NIRVANA achieves 6X decoding speed and scales well with more GPUs, making it practical for various deployment scenarios.
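
The effect of initializing each group's network from the previous group's weights can be seen in a deliberately tiny sketch. Here each "group of frames" is reduced to a single hypothetical target number and each "network" to a single weight fitted by gradient descent; none of this is NIRVANA's architecture, only an illustration of why warm starts shorten fitting when consecutive groups are similar.

```python
from typing import List, Tuple


def fit_group(target: float, init: float, lr: float = 0.25,
              tol: float = 1e-3) -> Tuple[float, int]:
    """Fit one weight to one target by gradient descent on (w - target)**2."""
    w, steps = init, 0
    while abs(w - target) > tol:
        w -= lr * 2.0 * (w - target)  # gradient of the squared error
        steps += 1
    return w, steps


def encode(groups: List[float], warm_start: bool) -> Tuple[List[float], int]:
    """Fit every group in order; optionally start from the previous group's weight."""
    weights: List[float] = []
    total_steps = 0
    prev = 0.0
    for target in groups:
        init = prev if warm_start else 0.0
        w, steps = fit_group(target, init)
        weights.append(w)
        total_steps += steps
        prev = w
    return weights, total_steps


if __name__ == "__main__":
    groups = [1.0, 1.1, 1.2]  # hypothetical, slowly varying content
    print("warm-start steps:", encode(groups, warm_start=True)[1])
    print("cold-start steps:", encode(groups, warm_start=False)[1])
```

When neighbouring groups differ only slightly, the warm-started fit begins close to its optimum and needs fewer steps in total, which mirrors the reduced encoding time the abstract attributes to the autoregressive design.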

AnyTOD: A Programmable Task-Oriented Dialog System

Jeffrey Zhao, Yuan Cao, Raghav Gupta, Harrison Lee, Abhinav Rastogi, Mingqiu Wang, Hagen Soltau, Izhak Shafran, Yonghui Wu

Category: Natural Language Processing

2022-12-20

We propose AnyTOD, an end-to-end task-oriented dialog (TOD) system with zero-shot capability for unseen tasks. We view TOD as a program executed by a language model (LM), where program logic and ontology are provided by a designer in the form of a schema. To enable generalization onto unseen schemas and programs without prior training, AnyTOD adopts a neuro-symbolic approach. A neural LM keeps track of events that occur during a conversation, and a symbolic program implementing the dialog policy is executed to recommend next actions AnyTOD should take. This approach drastically reduces data annotation and model training requirements, addressing a long-standing challenge in TOD research: rapidly adapting a TOD system to unseen tasks and domains. We demonstrate state-of-the-art results on the STAR and ABCD benchmarks, as well as AnyTOD's strong zero-shot transfer capability in low-resource settings. In addition, we release STARv2, an updated version of the STAR dataset with richer data annotations, for benchmarking zero-shot end-to-end TOD models.
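
The split between event tracking and a symbolic policy can be sketched as follows. The schema format, the event and action names, and the `recommend_actions` function are all hypothetical; the abstract does not describe AnyTOD's actual schema language, and the events that a neural LM would track are simply hard-coded here.

```python
from typing import Dict, List, Set

# Hypothetical schema: each action lists the events that must already have occurred.
SCHEMA: Dict[str, Set[str]] = {
    "ask_date": {"user_wants_booking"},
    "ask_party_size": {"user_wants_booking", "user_gave_date"},
    "confirm_booking": {
        "user_wants_booking", "user_gave_date", "user_gave_party_size",
    },
}


def recommend_actions(events: List[str], schema: Dict[str, Set[str]],
                      done: Set[str]) -> List[str]:
    """Return actions whose required events have all occurred and that are not yet done."""
    seen = set(events)
    return [
        action
        for action, required in schema.items()
        if required <= seen and action not in done
    ]


if __name__ == "__main__":
    # In the described system a neural LM would produce this event list.
    tracked_events = ["user_wants_booking", "user_gave_date"]
    print(recommend_actions(tracked_events, SCHEMA, done={"ask_date"}))
```

Since the policy lives entirely in the schema, supporting a new task in this sketch means writing a new dictionary rather than retraining anything, which is the kind of programmability the abstract describes.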

Probabilistic machine learning based predictive and interpretable digital twin for dynamical systems

Tapas Tripura, Aarya Sheetal Desai, Sondipon Adhikari, Souvik Chakraborty

Categories: Machine Learning (Statistics) | Machine Learning

2022-12-19

A framework for creating and updating digital twins for dynamical systems from a library of physics-based functions is proposed. Sparse Bayesian machine learning is used to update and derive an interpretable expression for the digital twin. Two approaches for updating the digital twin are proposed. The first approach makes use of both the input and output information from a dynamical system, whereas the second approach utilizes output-only observations to update the digital twin. Both methods use a library of candidate functions representing certain physics to infer new perturbation terms in the existing digital twin model. In both cases, the resulting expressions of updated digital twins are identical, and in addition, the epistemic uncertainties are quantified. In the first approach, the regression problem is derived from a state-space model, whereas in the latter case, the output-only information is treated as a stochastic process. The concepts of Itô calculus and Kramers-Moyal expansion are utilized to derive the regression equation. The performance of the proposed approaches is demonstrated using highly nonlinear dynamical systems such as the crack-degradation problem. Numerical results demonstrated in this paper almost exactly identify the correct perturbation terms along with their associated parameters in the dynamical system. The probabilistic nature of the proposed approach also helps in quantifying the uncertainties associated with updated models. The proposed approaches provide an exact and explainable description of the perturbations in digital twin models, which can be directly used for better cyber-physical integration, long-term future predictions, degradation monitoring, and model-agnostic control.

Speech Aware Dialog System Technology Challenge (DSTC11)

Hagen Soltau, Izhak Shafran, Mingqiu Wang, Abhinav Rastogi, Jeffrey Zhao, Ye Jia, Wei Han, Yuan Cao, Aramys Miranda

Category: Artificial Intelligence

2022-12-16

Most research on task-oriented dialog modeling is based on written text input. However, users often interact with practical dialog systems using speech as input. Typically, systems convert speech into text using an Automatic Speech Recognition (ASR) system, introducing errors. Furthermore, these systems do not address the differences between written and spoken language. Research on this topic is stymied by the lack of a public corpus. Motivated by these considerations, our goal in hosting the speech-aware dialog state tracking challenge was to create a public corpus or task which can be used to investigate the performance gap between the written and spoken forms of input, develop models that could alleviate this gap, and establish whether Text-to-Speech-based (TTS) systems are a reasonable surrogate for the more labor-intensive human data collection. We created three spoken versions of the popular written-domain MultiWoz task -- (a) TTS-Verbatim: written user inputs were converted into speech waveforms using a TTS system, (b) Human-Verbatim: humans spoke the user inputs verbatim, and (c) Human-paraphrased: humans paraphrased the user inputs. Additionally, we provided different forms of ASR output to encourage wider participation from teams that may not have access to state-of-the-art ASR systems. These included ASR transcripts, word time stamps, and latent representations of the audio (audio encoder outputs). In this paper, we describe the corpus, report results from participating teams, provide preliminary analyses of their results, and summarize the current state-of-the-art in this domain.

De-risking Carbon Capture and Sequestration with Explainable CO2 Leakage Detection in Time-lapse Seismic Monitoring Images

Huseyin Tuna Erdinc, Abhinav Prakash Gahlot, Ziyi Yin, Mathias Louboutin, Felix J. Herrmann

Categories: Artificial Intelligence | Computer Vision

2022-12-16

With the growing global deployment of carbon capture and sequestration technology to combat climate change, monitoring and detection of potential CO2 leakage through existing or storage-induced faults are critical to the safety and long-term viability of the technology. Recent work on time-lapse seismic monitoring of CO2 storage has shown promising results in its ability to monitor the growth of the CO2 plume from surface-recorded seismic data. However, due to the low sensitivity of seismic imaging to CO2 concentration, additional developments are required to efficiently interpret the seismic images for leakage. In this work, we introduce a binary classification of time-lapse seismic images to delineate CO2 plumes (leakage) using state-of-the-art deep learning models. Additionally, we localize the leakage region of CO2 plumes by leveraging Class Activation Mapping methods.

An ensemble neural network approach to forecast Dengue outbreak based on climatic condition

Madhurima Panja, Tanujit Chakraborty, Sk Shahid Nadim, Indrajit Ghosh, Uttam Kumar, Nan Liu

Category: Machine Learning

2022-12-16

Dengue fever is a virulent disease spreading across more than 100 tropical and subtropical countries in Africa, the Americas, and Asia. This arboviral disease affects around 400 million people globally, severely distressing the healthcare systems. The unavailability of a specific drug and ready-to-use vaccine makes the situation worse. Hence, policymakers must rely on early warning systems to control intervention-related decisions. Forecasts routinely provide critical information for dangerous epidemic events. However, the available forecasting models (e.g., weather-driven mechanistic, statistical time series, and machine learning models) lack a clear understanding of different components to improve prediction accuracy and often provide unstable and unreliable forecasts. This study proposes an ensemble wavelet neural network with exogenous factor(s) (XEWNet) model that can produce reliable estimates for dengue outbreak prediction for three geographical regions, namely San Juan, Iquitos, and Ahmedabad. The proposed XEWNet model is flexible and can easily incorporate exogenous climate variable(s) confirmed by statistical causality tests in its scalable framework. The proposed model is an integrated approach that incorporates wavelet transformation into an ensemble neural network framework, which helps in generating more reliable long-term forecasts. The proposed XEWNet allows complex non-linear relationships between the dengue incidence cases and rainfall, while remaining mathematically interpretable, fast in execution, and easily comprehensible. The proposal's competitiveness is measured using computational experiments based on various statistical metrics and several statistical comparison tests. In comparison with statistical, machine learning, and deep learning methods, our proposed XEWNet performs better in 75% of the cases for short-term and long-term forecasting of dengue incidence.

MAntRA: A framework for model agnostic reliability analysis

Yogesh Chandrakant Mathpati, Kalpesh Sanjay More, Tapas Tripura, Rajdip Nayek, Souvik Chakraborty

Categories: Machine Learning | Machine Learning (Statistics)

2022-12-13

We propose a novel model-agnostic, data-driven framework for time-dependent reliability analysis. The proposed approach -- referred to as MAntRA -- combines interpretable machine learning, Bayesian statistics, and stochastic dynamic equation identification to evaluate the reliability of stochastically excited dynamical systems for which the governing physics is a priori unknown. A two-stage approach is adopted: in the first stage, an efficient variational Bayesian equation discovery algorithm is developed to determine the governing physics of an underlying stochastic differential equation (SDE) from measured output data. The developed algorithm is efficient and accounts for epistemic uncertainty due to limited and noisy data, and aleatoric uncertainty due to environmental effects and external excitation. In the second stage, the discovered SDE is solved using a stochastic integration scheme and the probability of failure is computed. The efficacy of the proposed approach is illustrated on three numerical examples. The results obtained indicate the possible application of the proposed approach for reliability analysis of in-situ and heritage structures from on-site measurements.
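
The second stage can be illustrated with a minimal sketch. The abstract does not name the stochastic integration scheme or the failure criterion, so this example assumes the Euler-Maruyama scheme and a first-passage criterion (failure when |x| reaches a threshold before the end time), estimated by plain Monte Carlo; the function name and parameters are hypothetical.

```python
import math
import random
from typing import Callable


def failure_probability(
    drift: Callable[[float], float],
    diffusion: Callable[[float], float],
    x0: float,
    threshold: float,
    t_end: float,
    dt: float,
    n_paths: int,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of P(|x(t)| >= threshold for some t in (0, t_end])."""
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))
    sqrt_dt = math.sqrt(dt)
    failures = 0
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            dw = rng.gauss(0.0, sqrt_dt)  # Brownian increment over one step
            # Euler-Maruyama update for dx = drift(x) dt + diffusion(x) dW
            x += drift(x) * dt + diffusion(x) * dw
            if abs(x) >= threshold:
                failures += 1
                break  # first passage: stop simulating this path
    return failures / n_paths


if __name__ == "__main__":
    # Hypothetical "discovered" SDE: dx = -x dt + 0.8 dW
    p = failure_probability(lambda x: -x, lambda x: 0.8, x0=0.0,
                            threshold=1.0, t_end=2.0, dt=0.01, n_paths=2000)
    print(f"estimated failure probability: {p:.3f}")
```

In practice the drift and diffusion callables would be the expressions identified in the first stage, and the estimate's accuracy depends on both the step size `dt` and the number of simulated paths.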

A Neural ODE Interpretation of Transformer Layers

Yaofeng Desmond Zhong, Tongtao Zhang, Amit Chakraborty, Biswadip Dey

Categories: Machine Learning | Artificial Intelligence

2022-12-12

Transformer layers, which use an alternating pattern of multi-head attention and multi-layer perceptron (MLP) layers, provide an effective tool for a variety of machine learning problems. As the transformer layers use residual connections to avoid the problem of vanishing gradients, they can be viewed as the numerical integration of a differential equation. In this extended abstract, we build upon this connection and propose a modification of the internal architecture of a transformer layer. The proposed model places the multi-head attention sublayer and the MLP sublayer parallel to each other. Our experiments show that this simple modification improves the performance of transformer networks in multiple tasks. Moreover, for the image classification task, we show that using neural ODE solvers with a sophisticated integration scheme further improves performance.
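
The connection between residual connections and numerical integration can be made concrete with a small sketch. A residual update x + h·f(x) is one forward Euler step of dx/dt = f(x). The conventional layer applies the two sublayers one after the other, while the parallel arrangement evaluates both at the same input and adds them in a single step. The `toy_attention` and `toy_mlp` functions below are hypothetical stand-ins, not real attention or MLP sublayers, and the step size `h` is an illustrative parameter.

```python
import math
from typing import Callable, List

Vector = List[float]
Sublayer = Callable[[Vector], Vector]


def toy_attention(x: Vector) -> Vector:
    """Stand-in for attention: moves every entry toward the mean of all entries."""
    mean = sum(x) / len(x)
    return [mean - xi for xi in x]


def toy_mlp(x: Vector) -> Vector:
    """Stand-in for the MLP sublayer: an elementwise nonlinearity."""
    return [math.tanh(xi) for xi in x]


def sequential_layer(x: Vector, f: Sublayer, g: Sublayer, h: float = 1.0) -> Vector:
    """Two residual steps in a row: y = x + h*f(x), then y + h*g(y)."""
    y = [xi + h * fi for xi, fi in zip(x, f(x))]
    return [yi + h * gi for yi, gi in zip(y, g(y))]


def parallel_layer(x: Vector, f: Sublayer, g: Sublayer, h: float = 1.0) -> Vector:
    """One Euler step of dx/dt = f(x) + g(x): both sublayers see the same input."""
    fx, gx = f(x), g(x)
    return [xi + h * (fi + gi) for xi, fi, gi in zip(x, fx, gx)]


if __name__ == "__main__":
    x = [0.0, 2.0]
    print("sequential:", sequential_layer(x, toy_attention, toy_mlp))
    print("parallel:  ", parallel_layer(x, toy_attention, toy_mlp))
```

The two variants differ only by a term of order h squared, so they approximate the same underlying differential equation; in the parallel form the two sublayers have no data dependency on each other within a layer.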